Automatic Overheads Profiler for OpenMP Codes

Author

  • M. K. Bane
Abstract

Developing a good parallel implementation requires understanding where run time is spent and comparing it with a realistic best possible time. We introduce “overhead analysis” as a way of comparing achieved performance with achievable performance. We present a tool, OVALTINE, which aims to automatically provide the user with a hierarchical set of overheads for a given OpenMP implementation relative to a given serial implementation. We give preliminary results from OVALTINE on an SGI Origin2000 and show how the tool can be used to improve performance.
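
The comparison of achieved with achievable performance can be illustrated with a minimal sketch. It assumes that the achievable time is the serial time divided by the number of threads. This is an assumption made for illustration: the abstract does not define the best possible time or say how OVALTINE splits the total into its hierarchy of overheads. The function names and timings are hypothetical.

```python
def total_overhead(t_serial, t_parallel, threads):
    """Achieved parallel time minus the assumed ideal time (t_serial / threads)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    ideal = t_serial / threads  # assumption: perfect speedup is the achievable best
    return t_parallel - ideal


def efficiency(t_serial, t_parallel, threads):
    """Fraction of the assumed ideal speedup that was actually achieved."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return t_serial / (threads * t_parallel)


if __name__ == "__main__":
    # Hypothetical timings in seconds, not results from the paper.
    t_s, t_p, p = 100.0, 16.0, 8
    print("total overhead (s):", total_overhead(t_s, t_p, p))
    print("efficiency:", efficiency(t_s, t_p, p))
```

Under this assumption, a total overhead of zero would mean the parallel run reached the ideal time, and any positive value is time that remains to be attributed to specific causes.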


Related resources

Composing Low-Overhead Scheduling Strategies for Improving Performance of Scientific Applications

Many different sources of overheads impact the efficiency of a scheduling strategy applied to a parallel loop within a scientific application. In prior work, we handled these overheads using multiple loop scheduling strategies, with each scheduling strategy focusing on mitigating a subset of the overheads. However, mitigating the impact of one source of overhead can lead to an increase in the i...


OpenMP benchmark using PARKBENCH

Real application codes in OpenMP measure the performance of OpenMP programming on real problems. Although this is ultimately what the end user wants, full real applications are often large and complex. To obtain a guide to the performance of OpenMP parallel programs on any given parallel system, kernel and synthetic benchmarks are useful. PARKBENCH[4] is a set of ben...


Performance Analysis of Shared-Memory Parallel Applications Using Performance Properties

Tuning parallel code can be a time-consuming and difficult task. We present our approach to automate the performance analysis of OpenMP applications that is based on the notion of performance properties. Properties are formally specified in the APART specification language (ASL) with respect to a specific data model. We describe a data model for summary (profiling) data of OpenMP applications a...


Experiences with OpenMP in tmLQCD

An overview is given of the lessons learned from the introduction of multi-threading using OpenMP in tmLQCD. In particular, programming style, performance measurements, cache misses, scaling, thread distribution for hybrid codes, race conditions, the overlapping of communication and computation and the measurement and reduction of certain overheads are discussed. Performance measurements and sa...


Reparallelization techniques for migrating OpenMP codes in computational grids

Typical computational grid users target only a single cluster and have to estimate the runtime of their jobs. Job schedulers prefer short-running jobs to maintain a high system utilization. If the user underestimates the runtime, premature termination causes computation loss; overestimation is penalized by long queue times. As a solution, we present an automatic reparallelization and migration ...




Publication date: 2000